A heuristic-based fuzzy co-clustering algorithm for categorization of high-dimensional data

Authors

  • William-Chandra Tjhi
  • Lihui Chen
Abstract

Fuzzy co-clustering is a technique that performs simultaneous fuzzy clustering of objects and features. It is known to be suitable for categorizing high-dimensional data, because clustering the features at the same time acts as a dynamic dimensionality reduction mechanism. We introduce a new fuzzy co-clustering algorithm called Heuristic Fuzzy Co-clustering with the Ruspini’s condition (HFCR), which addresses several issues in some prominent existing fuzzy co-clustering algorithms. Among these issues are poor performance on data sets with overlapping feature clusters and the unnatural representation of feature clusters. The key idea behind HFCR is the formulation of a dual-partitioning approach for fuzzy co-clustering, replacing the existing partitioning-ranking approach. HFCR adopts an efficient and practical heuristic method that can be shown to be more robust than our earlier effort at the dual-partitioning approach. We explain the proposed algorithm in detail and provide an analytical study of its advantages. Experimental results on 10 large benchmark document data sets confirm the effectiveness of the new algorithm. © 2007 Elsevier B.V. All rights reserved.
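The abstract does not give the HFCR update rules, so the following is only a minimal sketch of the constraint that separates the two approaches it names. It assumes the usual reading of Ruspini's condition (each item's memberships sum to 1 across clusters) and assumes that the partitioning-ranking approach instead normalizes feature memberships within each cluster. The matrix `raw` and both helper functions are hypothetical and are not taken from the paper.

```python
def normalize_rows(matrix):
    """Scale each row so its entries sum to 1."""
    return [[v / sum(row) for v in row] for row in matrix]


def normalize_columns(matrix):
    """Scale each column so its entries sum to 1."""
    col_sums = [sum(col) for col in zip(*matrix)]
    return [[v / s for v, s in zip(row, col_sums)] for row in matrix]


# Hypothetical raw affinities of 3 features (rows) to 2 clusters (columns).
raw = [[4.0, 4.0], [1.0, 3.0], [3.0, 1.0]]

# Dual-partitioning view (assumed): every feature satisfies Ruspini's
# condition, so its memberships sum to 1 across the clusters.
dual = normalize_rows(raw)

# Partitioning-ranking view (assumed): within each cluster, the feature
# memberships sum to 1 across the features, which ranks features per cluster.
ranking = normalize_columns(raw)

print(dual)     # each row sums to 1
print(ranking)  # each column sums to 1
```

In the row-normalized view, the first feature is shared equally by both clusters, which is the kind of overlapping feature cluster the abstract refers to. In the column-normalized view, a membership value depends on how many features compete within the same cluster.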

Similar resources

High-Dimensional Unsupervised Active Learning Method

In this work, a hierarchical ensemble of projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM) which is a fuzzy learning scheme, inspired by some behavioral features of human brain functionality. High-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the da...

A Multi-Objective Approach to Fuzzy Clustering using ITLBO Algorithm

Data clustering is one of the most important areas of research in data mining and knowledge discovery. Recent research in this area has shown that the best clustering results can be achieved using multi-objective methods. In other words, assuming more than one criterion as objective functions for clustering data can measurably increase the quality of clustering. In this study, a model with two ...

A New Fuzzy Co-clustering Algorithm for Categorization of Datasets with Overlapping Clusters

Fuzzy co-clustering is a method that performs simultaneous fuzzy clustering of objects and features. In this paper, we introduce a new fuzzy co-clustering algorithm for high-dimensional datasets called Cosine-Distance-based & Dual-partitioning Fuzzy Co-clustering (CODIALING FCC). Unlike many existing fuzzy co-clustering algorithms, CODIALING FCC is a dual-partitioning algorithm. It clusters the fe...

A Fuzzy C-means Algorithm for Clustering Fuzzy Data and Its Application in Clustering Incomplete Data

The fuzzy c-means clustering algorithm is a useful tool for clustering; but it is convenient only for crisp complete data. In this article, an enhancement of the algorithm is proposed which is suitable for clustering trapezoidal fuzzy data. A linear ranking function is used to define a distance for trapezoidal fuzzy data. Then, as an application, a method based on the proposed algorithm is pres...

Solving Data Clustering Problems using Chaos Embedded Cat Swarm Optimization

In this paper, a new method is proposed for solving the data clustering problem using Cat Swarm Optimization (CSO) algorithm based on chaotic behavior. The problem of data clustering is an important section in the field of the data mining, which has always been noted by researchers and experts in data mining for its numerous applications in solving real-world problems. The CSO algorithm is one ...

Journal:
  • Fuzzy Sets and Systems

Volume 159, Issue 

Pages  -

Publication date: 2008